Multi-stream spectro-temporal features for robust speech recognition

Authors

  • Sherry Y. Zhao
  • Nelson Morgan
Abstract

A multi-stream approach to utilizing the inherently large number of spectro-temporal features for speech recognition is investigated in this study. Instead of reducing the feature-space dimension, this method divides the features into streams so that each represents a patch of information in the spectro-temporal response field. When used in combination with MFCCs for speech recognition under both clean and noisy conditions, multi-stream spectro-temporal features provide roughly a 30% relative improvement in word-error rate over using MFCCs alone. The result suggests that the multi-stream approach may be an effective way to handle and utilize spectro-temporal features for speech applications.
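The idea of dividing features into streams by patch, and the meaning of a relative word-error-rate improvement, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bin edges, and the toy modulation values are all hypothetical, and the sketch assumes each feature dimension is labelled with a (spectral modulation, temporal modulation) pair.

```python
from bisect import bisect_right
from collections import defaultdict


def split_into_streams(feature_mods, spectral_edges, temporal_edges):
    """Group feature indices by the patch of the modulation plane they fall in.

    feature_mods: list of (spectral_modulation, temporal_modulation) pairs,
                  one per feature dimension (hypothetical labelling).
    spectral_edges, temporal_edges: sorted bin boundaries (assumed values).
    Returns a dict mapping (spectral_bin, temporal_bin) -> list of indices.
    """
    streams = defaultdict(list)
    for index, (spec, temp) in enumerate(feature_mods):
        patch = (bisect_right(spectral_edges, spec),
                 bisect_right(temporal_edges, temp))
        streams[patch].append(index)
    return dict(streams)


def relative_improvement(baseline_wer, new_wer):
    """Relative WER reduction, e.g. 0.3 for a 30% relative improvement."""
    return (baseline_wer - new_wer) / baseline_wer


# Toy example with made-up modulation values, not data from the paper.
feature_mods = [(0.1, 2.0), (0.1, 10.0), (0.5, 2.0), (0.5, 10.0), (0.6, 12.0)]
streams = split_into_streams(feature_mods,
                             spectral_edges=[0.3],
                             temporal_edges=[6.0])
```

In this sketch the full feature set is kept and only partitioned, so each stream could be fed to its own classifier before the outputs are combined with an MFCC-based stream. How the streams are actually formed and combined in the study is not specified in the abstract.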

Similar resources

Multi-stream to many-stream: using spectro-temporal features for ASR

We report progress in the use of multi-stream spectro-temporal features for both small and large vocabulary automatic speech recognition tasks. Features are divided into multiple streams for parallel processing and dynamic utilization in this approach. For small vocabulary speech recognition experiments, the incorporation of up to 28 dynamically-weighted spectro-temporal feature streams along w...

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure

Speech recognition is robust to background noise. One underlying neural mechanism is that the auditory system segregates speech from the listening background and encodes it reliably. Such robust internal representation has been demonstrated in auditory cortex by neural activity entrained to the temporal envelope of speech. A paradox, however, then arises, as the spectro-temporal fine structure ...

Spectro-temporal modulations for robust speech emotion recognition

Speech emotion recognition is mostly considered in clean speech. In this paper, joint spectro-temporal features (RS features) are extracted from an auditory model and are applied to detect the emotion status of noisy speech. The noisy speech is derived from the Berlin Emotional Speech database with added white and babble noises under various SNR levels. The clean train/noisy test scenario is in...

Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition

The effect of bio-inspired spectro-temporal processing for automatic speech recognition (ASR) is analyzed for two different tasks with focus on the robustness of spectro-temporal Gabor features in comparison to mel-frequency cepstral coefficients (MFCCs). Experiments aiming at extrinsic factors such as additive noise and changes of the transmission channel were carried out on a digit classifica...

Publication date: 2008